Mining Association Algorithm with Improved Threshold Based on ROC Analysis
Abstract
The mining association algorithm is one of the most popular data mining algorithms for deriving association rules at high speed from huge databases. We have been developing navigation systems for semi-structured data such as Web data and bibliographic data. To guide beginners, our systems present the association rules derived by the algorithm. However, the algorithm tends to derive rules that contain noise such as stopwords, so many systems use noise filters to remove it. In order to remove the noise automatically and derive more effective rules, we proposed an algorithm that uses the true positive rate and the false positive rate of derived rules in a database, based on ROC analysis. In this paper, we correct the parameters to improve the extended mining association algorithm. Moreover, we evaluate the performance of our proposed algorithm using an experimental database and show how it can derive effective association rules. We also show that our proposed algorithm can remove stopwords automatically from raw data.
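The abstract does not give the exact formulas, so the following is only a minimal sketch of how the true positive rate and false positive rate of a rule X -> Y might be computed over a transaction database. It assumes the common ROC convention in which transactions containing Y are the positives and X acts as the predictor. The names `roc_rates` and `keep_rule`, the `min_gap` threshold, and the toy data are hypothetical and are not taken from the paper.

```python
from typing import FrozenSet, List, Tuple

Transaction = FrozenSet[str]


def roc_rates(
    transactions: List[Transaction],
    antecedent: FrozenSet[str],
    consequent: FrozenSet[str],
) -> Tuple[float, float]:
    """Return (TPR, FPR) for the rule antecedent -> consequent.

    Assumption: a transaction is "positive" if it contains the consequent,
    and the rule "fires" if the transaction contains the antecedent.
    """
    tp = fp = pos = neg = 0
    for t in transactions:
        fires = antecedent <= t      # antecedent is a subset of t
        positive = consequent <= t   # consequent is a subset of t
        if positive:
            pos += 1
            tp += fires
        else:
            neg += 1
            fp += fires
    tpr = tp / pos if pos else 0.0
    fpr = fp / neg if neg else 0.0
    return tpr, fpr


def keep_rule(tpr: float, fpr: float, min_gap: float = 0.5) -> bool:
    """Hypothetical filter: keep a rule only if TPR exceeds FPR by min_gap."""
    return tpr - fpr >= min_gap


# Toy database: "the" behaves like a stopword because it occurs everywhere.
db = [
    frozenset({"the", "data", "mining"}),
    frozenset({"the", "data", "mining"}),
    frozenset({"the", "web"}),
    frozenset({"the", "xml"}),
]

stopword_rule = roc_rates(db, frozenset({"the"}), frozenset({"mining"}))
content_rule = roc_rates(db, frozenset({"data"}), frozenset({"mining"}))
```

In this toy data the stopword rule fires on every transaction, so its true positive rate and false positive rate are equal and the hypothetical filter rejects it, while the rule from "data" fires only on the positive transactions and is kept. This illustrates why a threshold on the two rates could suppress stopwords without a hand-built noise filter, under the assumptions stated above.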
Similar resources
Mining Association Algorithm with Threshold based on ROC Analysis
The mining association algorithm is one of the most important data mining algorithms for deriving association rules at high speed from huge databases. However, the algorithm tends to derive rules that contain noise such as stopwords, so some systems remove the noise using noise filters. We have been improving the algorithm and developing navigation systems for semi-structured data using t...
A new approach based on data envelopment analysis with double frontiers for ranking the discovered rules from data mining
Data envelopment analysis (DEA) is a relatively new data-oriented approach to evaluating the performance of a set of peer entities called decision-making units (DMUs) that convert multiple inputs into multiple outputs. Within a relatively short period, DEA has become a strong quantitative and analytical tool for measuring and evaluating performance. In an article written by Toloo et al. (2009...
A Survey on Association Rule Mining Using Apriori Based Algorithm and Hash Based Methods
Association rule mining is one of the most important techniques in the field of data mining. Its main task is to find frequent patterns and mine association rules using minimum support thresholds chosen by the user. Research on incremental association rule mining is especially important. The Apriori algorithm is a classical algorithm in mining association rules. T...
Introducing an algorithm for use to hide sensitive association rules through perturb technique
Due to the rapid growth of data mining technology, obtaining private data on users through this technology becomes easier. Association Rules Mining is one of the data mining techniques to extract useful patterns in the form of association rules. One of the main problems in applying this technique on databases is the disclosure of sensitive data by endangering security and privacy. Hiding the as...
Investigating the Effect of Land Use and Soil’s Physio-chemical properties on Wind Erosion Threshold Velocities via Data Mining
Introduction: Wind erosion is a phenomenon that causes severe environmental changes in arid and semi-arid climates. As surface soil texture is very effective in soil erodibility, identifying soil erodibility index is important and efficient. Mismanagement greatly contributes to the development of wind erosion. The velocity that makes the first particles of soil move from the surface is called t...